AN EXTENSION OF WIENER INTEGRATION WITH THE USE 
OF OPERATOR THEORY 
Abstract. With the use of tensor products of Hilbert spaces, and a diagonalization
procedure from operator theory, we derive an approximation formula
for a general class of stochastic integrals. Further we establish a generalized 
Fourier expansion for these stochastic integrals. In our extension, we circum- 
vent some of the limitations of the more widely used stochastic integral due to 
Wiener and Ito, i.e., stochastic integration with respect to Brownian motion. 
Finally we discuss the connection between the two approaches, as well as a 
priori estimates and applications. 

1. Introduction

Recently there has been an increase in the number of applications of stochastic
integration and stochastic differential equations (SDEs). In addition to the traditional
applications in physics and dynamics, stochastic processes have found uses in such
areas as option pricing in finance, filtering in signal processing, and computations in
biological models. This suggests a need for widening the more traditional
approach centered about Brownian motion B(t) and Wiener's integral.

Since SDEs are solved with the use of stochastic integrals, we will focus here on 
integration with respect to a wider class of stochastic processes than has previously 
been considered. In evaluation of a stochastic integral we deal with the term dB(t) 
by making use of the basic properties of Brownian motion, such as the fact that
B(t) has independent increments. If instead X(t) is an arbitrary stochastic process, 
it is not at all clear how to make precise a stochastic integration with respect to 
dX(t). We will develop a method, based on a Karhunen-Loeve diagonalization, for 
doing precisely that. 

The theory of stochastic integrals is well developed; see e.g., [Kuo06, IM65].
For many applications, such as the solutions to stochastic differential equations
in physics and finance, it is important to have tools for evaluating integrals with
respect to dB where B is Brownian motion. The technical issues
involved in the computation of stochastic integrals can be understood this way: a
naive approach runs into difficulties, for example because the length of Brownian
paths is infinite, and because Brownian paths are nowhere differentiable (with probability
one). Wiener and Ito offered a by now well-known way around this difficulty. The
idea of Wiener is in fact operator theoretic: it is to establish the value of an integral
as a limit that takes place in a Hilbert space of random variables. This is successful
because of the existence of an isometry between this Hilbert space on the one hand
and a standard $L^2$-Lebesgue space on the other.

In this paper we extend this operator theoretic approach to a much wider class 
of stochastic integrals, i.e., integration with respect to dX where X belongs to a 
rather general class of stochastic processes. And we give some applications. 

In the proof of our theorem we make use of a result from two earlier papers
[JS07, JS08] by the coauthors. The idea is again operator theoretic, and it is
based on an application of von Neumann's spectral theorem to an integral operator 
directly associated with the process X under consideration. 

While the applications of stochastic integrals to physics (e.g., [BC97]) and their
interplay with operator theory (e.g., [AL08a]) are manifold, the idea of exploring and
extending the scope in the present direction appears to be new. The need for such 
an extension is convincing: For example, physical disturbances or perturbations will 
typically take you outside the particular path-space framework where the theory 
was initially developed. 

Earlier approaches to stochastic integrals include those based on reproducing kernels
[AL08a, AAL08, AL08b] and on operator theory [JM08], as well as papers exploring physical
ramifications: [BC97, BDSG+07, Hud07b, Hud07a]. The papers cited here with
physics applications represent only the tip of an iceberg.

2. Notation and Definitions 

To make precise the operator theoretic tools going into our construction, we must 
first introduce the ambient Hilbert spaces. Since stochastic integrals take values 
in a space of random variables, we must specify a fixed probability space $\Omega$, with
sigma algebra and probability measure. In the case of Brownian motion, the probability
space amounts to the standard construction of Wiener and Kolmogorov: The
essential axiom in that case is that all finite samples are jointly Gaussian. But we
will consider general stochastic processes, and so we will not make these restrictive
assumptions on the sample distributions and on the underlying probability space.
For more details on the Brownian motion case, see Section 4.

The kind of integrals we consider presently are stated precisely in Definition
2.1 and eq. (2.5) below. In particular, initially we consider only functions of time in the
integrand, i.e., $f(t)\,dX_t$. When the stochastic process X is given, we show (Theorem
3.1) that the corresponding integrals live in a Hilbert space which is a direct sum
of standard Lebesgue Hilbert spaces carrying the function f(t). In the case of
Brownian motion, we show (Example 4.1) that the direct sum representation then
has only one term.

We now list the symbols and the terminology. 

• $L^2$: an $L^2$-space.

• $L^2(\mathbb{R})$: all $L^2$-functions on $\mathbb{R}$.

• $J \subset \mathbb{R}$: a finite closed interval.

• $L^2(J)$: $L^2$ with respect to the Lebesgue measure restricted to $J$.

• $(\Omega, \mathcal{S}, P)$: a fixed probability space.

• $\Omega$: sample space.

• $\mathcal{S}$: some sigma algebra of subsets of $\Omega$.

• $P$: a probability measure defined on $\mathcal{S}$.

• $L^2(J \times \Omega, m \times P)$: the $L^2$-space on $J \times \Omega$ with respect to the product measure
$m \times P$, where $m$ denotes Lebesgue measure.

(1) $X_t : \Omega \to \mathbb{R}$, $t \in \mathbb{R}$: a stochastic process.

(2) $X_t(\omega) := X(t, \omega)$, $t \in \mathbb{R}$, $\omega \in \Omega$.

(3) $E(\cdot) := \int_\Omega \cdot \, dP$: the expectation with respect to $(\Omega, \mathcal{S}, P)$.

Restricting Assumptions:

(i) $X \in L^2(J \times \Omega, m \times P)$ for all finite intervals $J \subset \mathbb{R}$.

(ii) $(s, t) \mapsto E(X_s X_t)$ is continuous on $J \times J$. ($E(X_t) = 0$.)

(iii) For all $J$, and all $s \in J$, the function

(2.1) $t \mapsto E(X_s X_t)$

is of bounded variation.
For $J \subset \mathbb{R}$ fixed, we consider partitions

(2.2) $\pi : t_0 < t_1 < \cdots < t_{n-1} < t_n =: t$, $J = [t_0, t]$;

and we set

(2.3) $|\pi| := \max_i (t_{i+1} - t_i)$, and $\Delta t_i := t_{i+1} - t_i$, $0 \le i < n$.

If $f : \mathbb{R} \to \mathbb{C}$ is continuous, we set

(2.4) $S_\pi(f, X) := \sum_{i=0}^{n-1} f(t_i) \left( X_{t_{i+1}} - X_{t_i} \right)$.



Definition 2.1. By a stochastic integral, we mean a limit

(2.5) $\lim_{|\pi| \to 0} S_\pi(f, X) =: \int_{t_0}^{t} f(s)\, dX_s$.

We now turn to questions of existence of this limit for a rather general family of
stochastic processes $X_t$; see (i)-(iii) above.
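
To make the approximating sums in (2.4) concrete, the following is a minimal sketch of the left-endpoint sum for one sampled path. The function name `stieltjes_sum` and the deterministic test path $X_t = t^2$ are illustrative assumptions, not part of the paper; for a genuine stochastic process the list `path` would hold one realization $X_{t_i}(\omega)$.

```python
from typing import Callable, Sequence


def stieltjes_sum(f: Callable[[float], float],
                  times: Sequence[float],
                  path: Sequence[float]) -> float:
    """Left-endpoint sum  sum_i f(t_i) * (X_{t_{i+1}} - X_{t_i}), cf. (2.4).

    `times` holds the partition points t_0 < ... < t_n and `path` the
    corresponding values X_{t_0}, ..., X_{t_n} of one sample path.
    """
    if len(times) != len(path):
        raise ValueError("times and path must have the same length")
    total = 0.0
    for i in range(len(times) - 1):
        total += f(times[i]) * (path[i + 1] - path[i])
    return total


# Illustration with an assumed deterministic path X_t = t^2 on [0, 1]:
# the sums approximate the Stieltjes integral of t d(t^2), which equals 2/3.
n = 1000
times = [i / n for i in range(n + 1)]
path = [t * t for t in times]
approx = stieltjes_sum(lambda t: t, times, path)

# With f = 1 the sum telescopes to X_{t_n} - X_{t_0}.
telescoped = stieltjes_sum(lambda t: 1.0, times, path)
```

Refining the partition (larger `n`) corresponds to the limit $|\pi| \to 0$ in (2.5).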

3. Statement of the Main Theorem 

When the stochastic process X is given, Theorem 3.1 shows that the
corresponding integrals live in a Hilbert space which is a direct sum of standard
Lebesgue Hilbert spaces carrying the function f(t). In the case of Brownian motion,
the direct sum representation has only one term (see Section 4). Yet the
method of this section still offers a Fourier decomposition of the Wiener integration.



Theorem 3.1. Let $(\Omega, \mathcal{S}, P)$ be given as above, and let $X$ be a stochastic process
satisfying conditions (i)-(iii). Let $f$ be given and continuous.

(a) Then the stochastic integral $\int_{t_0}^{t} f(s)\, dX_s$ exists and is in $L^2(\Omega, P)$.

(b) There is a family of bounded variation functions $\varphi_1, \varphi_2, \dots$ and numbers
$\lambda_1, \lambda_2, \dots$ satisfying the following conditions:

$\lambda_1 \ge \lambda_2 \ge \cdots > 0$, $\lambda_k \to 0$,

in fact $\sum_k \lambda_k < \infty$, such that

(3.1) $E\left( \left| \int_{t_0}^{t} f(s)\, dX_s \right|^2 \right) = \sum_{k=1}^{\infty} \lambda_k \left| \int_{t_0}^{t} f(s)\, d\varphi_k(s) \right|^2$,

where the terms $\int_{t_0}^{t} f(s)\, d\varphi_k(s)$ refer to Stieltjes integration.

(c) If an interval $J$ is chosen such that $t_0$ and $t \in J$, and if $f$ has a weak derivative
$f'$ in $L^2(J)$, then the following estimate holds for the RHS in (3.1):

(3.2) RHS $\le$ "Const" $+ \lambda_1 \int_{t_0}^{t} |f'(s)|^2\, ds$,

where "Const" depends on certain boundary conditions, and where $\lambda_1$ is the
maximal eigenvalue; see (5.4) below.



In the next corollary, we stay with the assumptions from the theorem; in particular
$X_t$ is a stochastic process subject to conditions (i)-(iii), and a compact interval
$J$ is fixed.

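Corollary 3.2. Covariance relations:

• $E(X_s X_t) = \sum_{k=1}^{\infty} \lambda_k \varphi_k(s) \varphi_k(t)$

• $E(X_t^2) = \sum_{k=1}^{\infty} \lambda_k |\varphi_k(t)|^2$

• Dependency of increments: If $s < t < u$ in $J$, then

$E((X_t - X_s)(X_u - X_t)) = \sum_{k=1}^{\infty} \lambda_k \left( \varphi_k(t)\varphi_k(u) - \varphi_k(s)\varphi_k(u) + \varphi_k(s)\varphi_k(t) - |\varphi_k(t)|^2 \right)$

• $E((X_{t+\Delta t} - X_t)^2) = \sum_{k=1}^{\infty} \lambda_k |\varphi_k(t + \Delta t) - \varphi_k(t)|^2$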
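
The increment formulas in the corollary can be checked directly at the level of the covariance kernel $K(s,t) = E(X_s X_t)$, without computing any eigenfunctions. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the exponential kernel $e^{-|s-t|}$ is an example chosen here (it is not discussed in the paper) to contrast with the Brownian kernel $\min(s,t)$.

```python
import math
from typing import Callable

Kernel = Callable[[float, float], float]


def increment_covariance(K: Kernel, s: float, t: float, u: float) -> float:
    """E((X_t - X_s)(X_u - X_t)) for s < t < u, expanded by bilinearity."""
    return K(t, u) - K(s, u) + K(s, t) - K(t, t)


def increment_variance(K: Kernel, t: float, dt: float) -> float:
    """E((X_{t+dt} - X_t)^2), expanded by bilinearity."""
    return K(t + dt, t + dt) - 2.0 * K(t, t + dt) + K(t, t)


def brownian(a: float, b: float) -> float:
    """Covariance kernel of Brownian motion."""
    return min(a, b)


def exponential(a: float, b: float) -> float:
    """Assumed example kernel with correlated increments."""
    return math.exp(-abs(a - b))


cov_bm = increment_covariance(brownian, 0.2, 0.5, 0.9)      # uncorrelated increments
var_bm = increment_variance(brownian, 0.5, 0.1)              # equals dt
cov_exp = increment_covariance(exponential, 0.2, 0.5, 0.9)   # nonzero in general
```

For Brownian motion the first quantity vanishes and the second equals $\Delta t$, while a general kernel satisfying (i)-(iii) may give dependent increments.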

4. An Application 

In this section we restrict the setting of Theorem 3.1 to the special case when
X = B, i.e., to the special case of integration with respect to Brownian motion. 
We then work out the eigenfunctions and eigenvalues for the covariance operator. 
It turns out to be the familiar Fourier basis. Actually there is a choice of bases 
depending on boundary conditions. A choice of the Dirichlet conditions yields the
ONB of the sine functions. We further show that when the eigenvalue expansion is 
summed (using orthogonality) we then arrive at the familiar Wiener-Ito formula. 

Example 4.1. X = B = Brownian motion.

• $(\Omega, \mathcal{S}, P)$: Gaussian space;

• $\Omega$: a space of functions; $\mathcal{S}$: the sigma algebra generated by the cylinder sets;

• $J = [0, 1]$;

• $E(X_s X_t) = \min(s, t) =: s \wedge t$;

• $X_t(\omega) = \omega(t)$, $\omega \in \Omega$;

• $E((X_{t+\Delta t} - X_t)^2) = \Delta t$.

We now show that the known formula

(4.1) $E\left( \left| \int_{t_0}^{t} f(s)\, dB_s \right|^2 \right) = \int_{t_0}^{t} |f(s)|^2\, ds$

follows from the theorem; and in particular from (3.1).

In the case of Brownian motion for the functions $\varphi_k$ we may take

(4.2) $\varphi_k(t) = \sqrt{2} \sin(k\pi t)$; $k = 1, 2, \dots$

Note that

(4.3) $\varphi_k(0) = \varphi_k(1) = 0$, $\forall k = 1, 2, \dots$;

and

(4.4) $\lambda_k = \frac{1}{(k\pi)^2}$.

Set $t_0 = 0$ for simplicity. Note that if

(4.5) $g(t) = \int_0^1 t \wedge s\, f(s)\, ds$,

then

$g''(t) = -f(t)$,

so the eigenvalue problem

(4.6) $\int_0^1 t \wedge s\, f(s)\, ds = \lambda f(t)$

has the solution given by (4.2) and (4.4). Note further that (4.3) is a choice of
boundary conditions.

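To see that (4.1) follows from (3.1) in the theorem, we proceed as follows, starting
with the RHS in (3.1) and using $d(\sin(k\pi t)) = k\pi \cos(k\pi t)\, dt$:

$$\sum_{k=1}^{\infty} \lambda_k \left| \int_0^t f(s)\, d\varphi_k(s) \right|^2
 = 2 \sum_{k=1}^{\infty} \frac{1}{(k\pi)^2} \left| \int_0^t f(s)\, d\sin(k\pi s) \right|^2$$
$$ = 2 \sum_{k=1}^{\infty} \left| \int_0^t f(s) \cos(k\pi s)\, ds \right|^2$$
$$ = \int_0^t |f(s)|^2\, ds \quad \text{(by Parseval's formula)},$$

and the desired conclusion follows.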
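
Formula (4.1) can also be cross-checked numerically at the level of the approximating sums (2.4), using only the covariance kernel $E(B_s B_t) = s \wedge t$ and bilinearity of the expectation. The following is a sketch under assumptions: the function name `second_moment_of_sum` and the choice $f(t) = t$ on a uniform partition of $[0,1]$ are illustrative and not taken from the paper.

```python
from typing import Callable, Sequence

Kernel = Callable[[float, float], float]


def second_moment_of_sum(f: Callable[[float], float],
                         times: Sequence[float],
                         K: Kernel) -> float:
    """E(|S_pi(f, X)|^2) for real f, computed from K(a, b) = E(X_a X_b).

    Expands E(sum_i sum_j f(t_i) f(t_j) dX_i dX_j), where dX_i is the
    increment X_{t_{i+1}} - X_{t_i}, by bilinearity.
    """
    n = len(times) - 1
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Covariance of the i-th and j-th increments.
            c = (K(times[i + 1], times[j + 1]) - K(times[i + 1], times[j])
                 - K(times[i], times[j + 1]) + K(times[i], times[j]))
            total += f(times[i]) * f(times[j]) * c
    return total


# Brownian kernel on a uniform partition of [0, 1] with f(t) = t:
# the value should approach the integral of t^2 over [0, 1], i.e. 1/3.
n = 200
times = [i / n for i in range(n + 1)]
value = second_moment_of_sum(lambda t: t, times, lambda a, b: min(a, b))
```

For the Brownian kernel the off-diagonal increment covariances vanish, so the double sum collapses to $\sum_i f(t_i)^2 \Delta t_i$, a Riemann sum for the right-hand side of (4.1).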

5. Proof of Theorem 3.1

Here we give the details of the proof of Theorem 3.1. Since the proof is long, to help
the reader our presentation is divided into two parts, A and B.

Part A is an outline of the steps in the proof itself, and part B contains the
detailed arguments making up each part in the proof. Part A begins with the notation
and the terminology, introducing an auxiliary selfadjoint operator, its matrix
approximations, and its spectral resolution.

5.1. Part A.

• Select a fixed interval $J := [a, b]$, $a < b$.

• From the assumptions on the process $(X_t)_{t \in \mathbb{R}}$, note that the operator

(5.1) $(T_J f)(t) := \int_J E(X_t X_s) f(s)\, ds$

is compact and selfadjoint in the Hilbert space $L^2(J)$.

• For every partition

$\pi : t_0 < t_1 < \cdots < t_n$, $t_0 = a$, $t_n = b$;

the following matrix

(5.2) $(M_\pi)_{i,j} := E(X_{t_i} X_{t_j})$

offers a discrete approximation for the operator $T_J$ in (5.1).

• Set

(5.3) $H(J) := L^2(J) \ominus \ker(T_J) = \{ g \in L^2(J) : \langle g, k \rangle_{L^2} = 0,\ \forall k \in \ker(T_J) \}$.

Then an application of the spectral theorem to $T_J$ yields a
sequence of orthogonal eigenfunctions $\varphi_1, \varphi_2, \dots$ in $H(J)$, and numbers
$\lambda_1, \lambda_2, \dots \in \mathbb{R}_+$ with

$\lambda_1 \ge \lambda_2 \ge \cdots > 0$, $\lambda_k \to 0$,

such that

(5.4) $T_J \varphi_k = \lambda_k \varphi_k$, $k = 1, 2, \dots$;

orthogonality relations in the $t$-domain:

(5.5) $\langle \varphi_j, \varphi_k \rangle_{L^2(J)} = \int_J \overline{\varphi_j}\, \varphi_k\, dm = \delta_{j,k}$;

and the closed span of $\{\varphi_k\}$ is $H(J)$.

• Set

(5.6) $Z_k(\cdot) := \frac{1}{\sqrt{\lambda_k}} \int_J \overline{\varphi_k(t)}\, X_t(\cdot)\, dt$,

and note that each $Z_k$, $k = 1, 2, \dots$ is a random variable,

$Z_k \in L^2(\Omega, \mathcal{S}, P)$.

Moreover, a calculation yields orthogonality relations in the $\omega$-domain:

(5.7) $E(\overline{Z_j} Z_k) = \delta_{j,k}$.

• Aside: note that if $(X_t)$ is assumed Gaussian, then each $Z_k$, $k = 1, 2, \dots$ is
Gaussian as well.

• Karhunen-Loeve, or Generalized Fourier Expansion:

In $L^2(J \times \Omega, m \times P)$, we have the following pointwise a.e. representation

(5.8) $X(t, \omega) = \sum_{k=1}^{\infty} \sqrt{\lambda_k}\, \varphi_k(t) Z_k(\omega)$,

as well as

(5.9) $\lim_{n \to \infty} \left\| X - \sum_{k=1}^{n} \sqrt{\lambda_k}\, \varphi_k Z_k \right\|_{L^2(m \times P)} = 0$.
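
The discrete approximation (5.2) and the leading spectral pair in (5.4) can be sketched numerically as follows. This is an illustration under stated assumptions, not the construction used in the proof: the kernel $e^{-|s-t|}$ is a hypothetical example, the matrix is weighted by the mesh size (a midpoint quadrature of (5.1)), and plain power iteration is swapped in as a simple way to approximate the largest eigenvalue $\lambda_1$.

```python
import math
from typing import Callable, List, Tuple

Kernel = Callable[[float, float], float]


def discretize(K: Kernel, a: float, b: float, n: int) -> List[List[float]]:
    """Midpoint-rule matrix for (T_J f)(t) = int_J K(t, s) f(s) ds on J = [a, b]."""
    h = (b - a) / n
    pts = [a + (i + 0.5) * h for i in range(n)]
    return [[K(t, s) * h for s in pts] for t in pts]


def matvec(M: List[List[float]], v: List[float]) -> List[float]:
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def top_eigenpair(M: List[List[float]], iters: int = 200) -> Tuple[float, List[float]]:
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient <v | M v> of the final unit vector.
    lam = sum(vi * wi for vi, wi in zip(v, matvec(M, v)))
    return lam, v


# Assumed example kernel on J = [0, 1].
M = discretize(lambda t, s: math.exp(-abs(t - s)), 0.0, 1.0, 50)
lam, v = top_eigenpair(M)
trace = sum(M[i][i] for i in range(len(M)))  # approximates int_J E(X_t^2) dt
```

The vector `v` is a unit vector in the discrete (Euclidean) norm and plays the role of a sampled version of $\varphi_1$ up to normalization; the largest eigenvalue is bounded by the trace, in line with the trace-class property used in Part B.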



5.2. Part B. We now turn to the details of the proof of (3.1) and (3.2).
Writing out equation (5.4), we get



(5.10) $\int_J E(X_t X_s) \varphi_k(s)\, ds = \lambda_k \varphi_k(t)$;

and so from the assumptions (i)-(iii) and eq. (2.1) we conclude that each of the
eigenfunctions $\varphi_k(\cdot)$ is continuous and of bounded variation.

This means that whenever $t_0, t \in J$, i.e., $a \le t_0 < t \le b$, the expression

(5.11) $\int_{t_0}^{t} f(s)\, d\varphi_k(s)$

is a well defined Stieltjes integral. Moreover, if $f$ is assumed to be of bounded variation, then

(5.12) $\int_{t_0}^{t} f\, d\varphi_k = [f \varphi_k]_{t_0}^{t} - \int_{t_0}^{t} \varphi_k(s) f'(s)\, ds$.

We now turn to the approximation (2.4) underlying the theorem, and we use the
Karhunen-Loeve expansion (5.8) in the computation of

(5.13) $\Delta X_{t_i} := X_{t_{i+1}} - X_{t_i}$

for a fixed (chosen) partition $\pi$ as specified in (2.2).

Using condition (ii) among the restricting assumptions, we note that for fixed $J$,
the operator $T_J$ in (5.1) is trace-class. From operator theory (Mercer's theorem),
we know that

(5.14) $\int_J E(X_t^2)\, dt = \sum_{k=1}^{\infty} \lambda_k < \infty$,

i.e., integration in (5.1) over the diagonal $s = t$. And so in particular, finiteness of
$\sum_{k=1}^{\infty} \lambda_k$ follows.

In the study of the operator $T_J$ from (5.1) we make use of tools from Hilbert
space theory of integral operators. In particular, in the estimate (5.14) we use
Mercer's theorem. However in applications to covariance kernels (2.1) one often
has stronger properties. It is known that if the kernel in (2.1) is Lipschitz of
degree $\gamma$ with $\gamma > \frac{1}{2}$ in one of the two variables (with the other fixed), then the
operator $T_J$ in (5.1) will automatically be nuclear. For the literature on this we
refer to [Dos93, Küh83, LL52, Sti58]. We further note that this Lipschitz condition
is indeed satisfied for the covariance kernel of fractional Brownian motion, see e.g.,
[IA07].

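Set $(\Delta\varphi_k)_{t_i} := \varphi_k(t_i + \Delta t_i) - \varphi_k(t_i)$. Using the Hilbert space $L^2(J \times \Omega, m \times P)$
and its tensor-product representation, $L^2(J) \otimes L^2(\Omega, \mathcal{S}, P)$, we get

$$S_\pi(f, X) = \sum_{i=0}^{n-1} f(t_i)\, \Delta X_{t_i}
 = \sum_{i=0}^{n-1} \sum_{k=1}^{\infty} f(t_i) (\Delta\varphi_k)_{t_i} \sqrt{\lambda_k}\, Z_k
 = \sum_{k=1}^{\infty} \left( \sum_{i=0}^{n-1} f(t_i) (\Delta\varphi_k)_{t_i} \right) \sqrt{\lambda_k}\, Z_k$$

and therefore, by the orthogonality relations (5.7),

(5.15) $E\left( |S_\pi(f, X)|^2 \right) = \sum_{k=1}^{\infty} \lambda_k \left| \sum_{i=0}^{n-1} f(t_i) (\Delta\varphi_k)_{t_i} \right|^2$.

Since $f$ is assumed continuous, and each $\varphi_k$ is of bounded variation, the following
convergence holds:

(5.16) $\lim_{|\pi| \to 0} \sum_{i=0}^{n-1} f(t_i) (\Delta\varphi_k)_{t_i} = \int_{t_0}^{t} f\, d\varphi_k$.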


Now if the function $f$ satisfies $f \in L^2(J)$ and $f' \in L^2(J)$, then we get the
following estimate, relying on the boundary representation (5.12) and Parseval, see
also (5.5). For the RHS in (5.15) we have, after passing to the limit:

$$\sum_{k=1}^{\infty} \lambda_k \left| \int_{t_0}^{t} f(s)\, d\varphi_k(s) \right|^2
 = (\text{boundary terms}) + \sum_{k=1}^{\infty} \lambda_k \left| \int_{t_0}^{t} \varphi_k(s) f'(s)\, ds \right|^2 \quad \text{(by (5.12))}$$
$$\le (\text{boundary terms}) + \lambda_1 \sum_{k=1}^{\infty} \left| \int_{t_0}^{t} \varphi_k f'\, ds \right|^2$$
$$\le (\text{boundary terms}) + \lambda_1 \int_{t_0}^{t} |f'(s)|^2\, ds \quad \text{(by (5.5) and Parseval)},$$

which is the desired conclusion in part (c) of the theorem.



Proof of Corollary 3.2. The essential point is formula (5.8). However, in substitution
of the expression on the RHS in (5.8) we make use of double-orthogonality,
i.e., (5.5) and (5.7). Specifically, we have the tensor product representation
$L^2(J \times \Omega, m \times P) = L^2(J) \otimes L^2(\Omega)$, and so $X = \sum_{k=1}^{\infty} \sqrt{\lambda_k}\, \varphi_k \otimes Z_k$ in (5.8) refers to the
tensor representation. □

6. Entropy: Optimal Bases 

In this section we compare the choice of ONB from Section 3 with alternative
choices of ONBs. The application of Karhunen-Loeve dictates a particular choice 
of ONB. 

Historically, the Karhunen-Loeve decomposition arose as a tool from the interface of probability
theory and information theory; see details with references inside the paper. It has
served as a powerful tool in a variety of applications, starting with the problem
of separating variables in stochastic processes, say $X_t$; processes that arise from
statistical noise, for example from fractional Brownian motion. Since the initial
inception in mathematical statistics, the operator algebraic contents of the arguments
have crystallized as follows: starting from the process $X_t$, for simplicity assume zero
mean, i.e., $E(X_t) = 0$; create a correlation matrix $T_J(s,t) = E(X_s X_t)$. (Strictly
speaking, it is not a matrix, but rather an integral kernel. Nonetheless, the matrix
terminology has stuck.) The next key analytic step in the Karhunen-Loeve method
is to then apply the Spectral Theorem from operator theory to a corresponding
selfadjoint operator, or to some operator naturally associated with the integral kernel:
Hence the name, the Karhunen-Loeve Decomposition (KLC). In favorable cases
(discrete spectrum), an orthogonal family of functions $(\varphi_n(t))$ in the time variable
arises, and a corresponding family of eigenvalues. We take them to be normalized
in a suitably chosen square-norm. By integrating the basis functions $\varphi_n(t)$ against
$X_t$, we get a sequence of random variables $Z_n$. It was the insight of Karhunen and
Loeve [Loe52] to give general conditions for when this sequence of random variables
is independent, and to show that if the initial random process $X_t$ is Gaussian, then
so are the random variables $Z_n$.

Below, we take advantage of the fact that Hilbert space and operator theory
form the common language of both quantum mechanics and of signal/image processing.
Recall first that in quantum mechanics, (pure) states as mathematical
entities "are" one-dimensional subspaces in complex Hilbert space $\mathcal{H}$, so we may
represent them by vectors of norm one. Observables "are" selfadjoint operators in
$\mathcal{H}$, and the measurement problem entails von Neumann's spectral theorem applied
to the operators.

In signal processing, time-series, or matrices of pixel numbers may similarly be
realized by vectors in Hilbert space $\mathcal{H}$. The probability distribution of quantum
mechanical observables (state space $\mathcal{H}$) may be represented by choices of orthonormal
bases (ONBs) in $\mathcal{H}$ in the usual way (see e.g., [Jor06]). In the 1940s, Kari Karhunen
([Kar46], [Kar52]) pioneered the use of spectral theoretic methods in the analysis
of time series, and more generally in stochastic processes. It was followed up by
papers and books by Michel Loeve in the 1950s [Loe52], and in 1965 by R.B. Ash
[Ash90]. (Note that this theory precedes the surge in the interest in wavelet bases!)

Parallel problems in quantum mechanics and in signal processing entail the choice 
of "good" orthonormal bases (ONBs). One particular such ONB goes under the 
name "the Karhunen-Loeve basis." We will show that it is optimal. 

Definition 6.1. Let $\mathcal{H}$ be a Hilbert space. Let $(\psi_i)$ and $(\varphi_i)$ be orthonormal
bases (ONBs). If $(\psi_i)_{i \in \mathbb{N}}$ is an ONB, we set $Q_n^{\psi}$ := the orthogonal projection onto

span$\{\psi_1, \dots, \psi_n\}$.

We now introduce a few facts about operators which will be needed in the paper. 
In particular we recall Dirac's terminology [Dir47] for rank-one operators in Hilbert
space. While there are alternative notations available, Dirac's bra-ket terminology
is especially efficient for our present considerations.

Definition 6.2. Let vectors $u, v \in \mathcal{H}$. Then

(6.1) $\langle u | v \rangle$ = inner product $\in \mathbb{C}$,

(6.2) $|u\rangle\langle v|$ = rank-one operator, $\mathcal{H} \to \mathcal{H}$,

where the operator $|u\rangle\langle v|$ acts as follows

(6.3) $|u\rangle\langle v|\, w = |u\rangle \langle v | w \rangle = \langle v | w \rangle u$, for all $w \in \mathcal{H}$.

Dirac's bra-ket and ket-bra notation is popular in physics, and it is especially
convenient in working with rank-one operators and inner products. For example,
in the middle term in eq. (6.3), the vector $u$ is multiplied by a scalar, the inner
product; and the inner product comes about by just merging the two vectors.

Definition 6.3. If $S$ and $T$ are bounded operators in $\mathcal{H}$, i.e., in $B(\mathcal{H})$, then

(6.4) $S\, |u\rangle\langle v|\, T = |Su\rangle\langle T^* v|$.

If $(\psi_i)_{i \in \mathbb{N}}$ is an ONB then the projection

$Q_n^{\psi}$ := proj span$\{\psi_1, \dots, \psi_n\}$

is given by

(6.5) $Q_n^{\psi} = \sum_{i=1}^{n} |\psi_i\rangle\langle\psi_i|$;

and for each $i$, $|\psi_i\rangle\langle\psi_i|$ is the projection onto the one-dimensional subspace $\mathbb{C}\psi_i \subset \mathcal{H}$.
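
In finite dimensions the rank-one operators (6.2)-(6.3) and the projections (6.5) can be written out directly. The following is a minimal sketch in $\mathbb{C}^n$ with vectors stored as lists of complex numbers; the function names are hypothetical, and the inner product is taken conjugate-linear in the first argument, matching the physics convention assumed here.

```python
from typing import Callable, List, Sequence

Vector = List[complex]
Operator = Callable[[Sequence[complex]], Vector]


def inner(u: Sequence[complex], v: Sequence[complex]) -> complex:
    """<u|v>, conjugate-linear in u and linear in v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def ket_bra(u: Sequence[complex], v: Sequence[complex]) -> Operator:
    """The rank-one operator |u><v|, acting as w -> <v|w> u, cf. (6.3)."""
    def apply(w: Sequence[complex]) -> Vector:
        c = inner(v, w)
        return [c * a for a in u]
    return apply


def projection(basis: Sequence[Sequence[complex]]) -> Operator:
    """Q_n = sum_i |psi_i><psi_i| for an orthonormal family psi_1..psi_n, cf. (6.5)."""
    def apply(w: Sequence[complex]) -> Vector:
        out = [0j] * len(w)
        for psi in basis:
            term = ket_bra(psi, psi)(w)
            out = [o + t for o, t in zip(out, term)]
        return out
    return apply


# Example in C^3 with the first two standard basis vectors.
e1 = [1 + 0j, 0j, 0j]
e2 = [0j, 1 + 0j, 0j]
Q2 = projection([e1, e2])
w = [1 + 0j, 2 + 0j, 3 + 0j]
Qw = Q2(w)  # keeps the first two components and removes the third
```

Applying `Q2` twice gives the same result as applying it once, as expected for an orthogonal projection.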

Definition 6.4. Suppose $X_t$ is a stochastic process indexed by $t$ in a finite interval
$J$, and taking values in $L^2(\Omega, P)$ for some probability space $(\Omega, P)$. Assume the
normalization $E(X_t) = 0$. Suppose the integral kernel $E(X_t X_s)$ can be diagonalized,
i.e., suppose that

$\int_J E(X_t X_s) \varphi_k(s)\, ds = \lambda_k \varphi_k(t)$

with an ONB $(\varphi_k)$ in $L^2(J)$. If $E(X_t) = 0$ then

$X_t(\omega) = \sum_k \sqrt{\lambda_k}\, \varphi_k(t) Z_k(\omega)$, $\omega \in \Omega$,

where $E(Z_j Z_k) = \delta_{j,k}$, and $E(Z_k) = 0$. The ONB $(\varphi_k)$ is called the KL-basis with
respect to the stochastic process $\{X_t : t \in J\}$.

Theorem 6.5. (See [JS07].) The Karhunen-Loeve ONB gives the smallest error
terms in the approximation to a frame operator.

Proof. Given the operator $T_J$ which is trace class and positive semidefinite, we may
apply the spectral theorem to it. What results is a discrete spectrum, with the natural
order $\lambda_1 \ge \lambda_2 \ge \dots$ and a corresponding ONB $(\varphi_k)$ consisting of eigenvectors,
i.e.,

(6.6) $T_J \varphi_k = \lambda_k \varphi_k$, $k \in \mathbb{N}$,

called the Karhunen-Loeve data. The spectral data may be constructed recursively
starting with

(6.7) $\lambda_1 = \sup_{\varphi \in \mathcal{H},\ \|\varphi\| = 1} \langle \varphi | T_J \varphi \rangle = \langle \varphi_1 | T_J \varphi_1 \rangle$

and

(6.8) $\lambda_{k+1} = \sup_{\varphi \in \mathcal{H},\ \|\varphi\| = 1,\ \varphi \perp \varphi_1, \dots, \varphi_k} \langle \varphi | T_J \varphi \rangle = \langle \varphi_{k+1} | T_J \varphi_{k+1} \rangle$.

Now an application of [ArKa06, Theorem 4.1] yields

(6.9) $\sum_{k=1}^{n} \lambda_k \ge \operatorname{tr}(Q_n^{\psi} T_J) = \sum_{k=1}^{n} \langle \psi_k | T_J \psi_k \rangle$ for all $n$,

where $Q_n^{\psi}$ is the sequence of projections from (6.5), deriving from some ONB $(\psi_i)$
and arranged such that

(6.10) $\langle \psi_1 | T_J \psi_1 \rangle \ge \langle \psi_2 | T_J \psi_2 \rangle \ge \cdots$.

Hence we are comparing ordered sequences of eigenvalues with sequences of diagonal
matrix entries.
Finally, we have

$\operatorname{tr}(T_J) = \sum_{k=1}^{\infty} \lambda_k = \sum_{k=1}^{\infty} \langle \psi_k | T_J \psi_k \rangle < \infty$.

The assertion in Theorem 6.5 is the validity of

(6.11) $E_n^{\varphi} \le E_n^{\psi}$

for all $(\psi_i) \in ONB(\mathcal{H})$, and all $n = 1, 2, \dots$; and moreover, that the infimum on
the RHS in (6.11) is attained for the KL-ONB $(\varphi_k)$. But we see that (6.11) is
equivalent to the system (6.9) in the Arveson-Kadison theorem. □

The Arveson-Kadison theorem is the assertion (6.9) for trace class operators, see
e.g., refs [Arv06] and [ArKa06]. That (6.11) is equivalent to (6.9) follows from the
definitions.
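
A finite-dimensional instance of the inequalities (6.9) can be checked by hand. The sketch below is an illustration under assumptions: the eigenvalues $0.6, 0.3, 0.1$ and the two rotation angles are arbitrary choices, and the competing ONB $(\psi_r)$ is given by the rows of an orthogonal matrix $R$ written in the eigenbasis, so that $\langle \psi_r | T \psi_r \rangle = \sum_k R_{rk}^2 \lambda_k$.

```python
import math
from typing import List

Matrix = List[List[float]]


def rotation(n: int, i: int, j: int, theta: float) -> Matrix:
    """n x n rotation by angle theta in the (i, j) coordinate plane."""
    R = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    R[i][i] = math.cos(theta)
    R[j][j] = math.cos(theta)
    R[i][j] = -math.sin(theta)
    R[j][i] = math.sin(theta)
    return R


def matmul(A: Matrix, B: Matrix) -> Matrix:
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


# Assumed eigenvalues of a positive trace-one operator, in decreasing order.
eigenvalues = [0.6, 0.3, 0.1]

# Rows of R form an ONB (psi_r) expressed in the eigenbasis.
R = matmul(rotation(3, 0, 1, 0.7), rotation(3, 1, 2, 1.1))

# Diagonal entries <psi_r | T psi_r>, arranged in decreasing order as in (6.10).
diagonal = sorted(
    (sum(R[r][k] ** 2 * eigenvalues[k] for k in range(3)) for r in range(3)),
    reverse=True,
)

partial_diag = [sum(diagonal[:n]) for n in range(1, 4)]
partial_eig = [sum(eigenvalues[:n]) for n in range(1, 4)]
```

Each entry of `partial_diag` is bounded by the corresponding entry of `partial_eig`, with equality for the full sum, which is the trace.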

Our next theorem gives Karhunen-Loeve optimality for sequences of entropy
numbers. 

Theorem 6.6. (See [JS07].) The Karhunen-Loeve ONB gives the smallest sequence
of entropy numbers in the approximation.

Proof. We begin with a few facts about entropy of trace-class operators $T_J$. The
entropy is defined as

(6.12) $S(T_J) := -\operatorname{tr}(T_J \log T_J)$.

The formula will be used on cut-down versions of an initial operator $T_J$. In some
cases only the cut-down might be trace-class. Since the Spectral Theorem applies
to $T_J$, the RHS in (6.12) is also

(6.13) $S(T_J) = -\sum_{k=1}^{\infty} \lambda_k \log \lambda_k$.

For simplicity we normalize such that $1 = \operatorname{tr} T_J = \sum_{k=1}^{\infty} \lambda_k$, and we introduce the
partial sums

(6.14) $S_n^{KL}(T_J) := -\sum_{k=1}^{n} \lambda_k \log \lambda_k$

and

(6.15) $S_n^{\psi}(T_J) := -\sum_{k=1}^{n} \langle \psi_k | T_J \psi_k \rangle \log \langle \psi_k | T_J \psi_k \rangle$.

Let $(\psi_i) \in ONB(\mathcal{H})$, and set $d_k^{\psi} := \langle \psi_k | T_J \psi_k \rangle$; then the inequalities (6.9) take
the form

(6.16) $\operatorname{tr}(Q_n^{\psi} T_J) = \sum_{i=1}^{n} d_i^{\psi} \le \sum_{i=1}^{n} \lambda_i$, $n = 1, 2, \dots$

where as usual an ordering

(6.17) $d_1^{\psi} \ge d_2^{\psi} \ge \cdots$

has been chosen.

Now the function $\beta(t) := t \log t$ is convex. An application of Remark 6.3 in
[ArKa06] then yields

(6.18) $\sum_{i=1}^{n} \beta(d_i^{\psi}) \le \sum_{i=1}^{n} \beta(\lambda_i)$, $n = 1, 2, \dots$

Since the RHS in (6.18) is $\sum_{i=1}^{n} \lambda_i \log \lambda_i = -S_n^{KL}(T_J)$, the desired inequalities

(6.19) $S_n^{KL}(T_J) \le S_n^{\psi}(T_J)$, $n = 1, 2, \dots$

follow, i.e., the KL-data minimizes the sequence of entropy numbers. □
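
As a small sanity check of the entropy comparison, the sketch below treats only the two-dimensional case with the complete sum ($n = 2$), under assumed eigenvalues $0.8$ and $0.2$. A competing ONB is obtained by rotating the eigenbasis through an angle $\theta$, which gives the diagonal entries $d_1 = \cos^2\theta\, \lambda_1 + \sin^2\theta\, \lambda_2$ and $d_2 = \sin^2\theta\, \lambda_1 + \cos^2\theta\, \lambda_2$. The names and numbers are illustrative and not from the paper.

```python
import math
from typing import Sequence, Tuple


def entropy(weights: Sequence[float]) -> float:
    """-sum w log w over the positive weights, cf. (6.13)-(6.15)."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


# Assumed eigenvalues, normalized so that the trace is one.
lam: Tuple[float, float] = (0.8, 0.2)


def rotated_diagonal(theta: float) -> Tuple[float, float]:
    """Diagonal entries <psi_k | T psi_k> for the eigenbasis rotated by theta."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    return (c2 * lam[0] + s2 * lam[1], s2 * lam[0] + c2 * lam[1])


s_kl = entropy(lam)

# Entropy excess of the rotated bases over the KL basis, for a range of angles.
gaps = [entropy(rotated_diagonal(0.1 * j)) - s_kl for j in range(16)]
```

Every entry of `gaps` is nonnegative up to rounding, and the gap vanishes at $\theta = 0$, where the rotated basis coincides with the KL basis.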